On the relationship between the number of ovules formed and the 
number of seeds developing in Cercis 

J. Arthur Harris 
(with three text figures) 

I. Introductory Remarks 

In an earlier paper* I have stated certain problems concerning 
the relationship between the number of ovules laid down and the 
capacity of the ovary for maturing its ovules into seeds, and have 
illustrated the methods which seem suitable to me for their 
solution by a series of data drawn from experimental cultures of 
Phaseolus vulgaris. The results of this first analysis of extensive 
series of data seem to render desirable the like treatment of other 
similar but quite distinct materials. The present paper is, there- 
fore, devoted to the analysis of numerous data from a wild small- 
seeded arborescent legume, Cercis canadensis. 

The study has been in progress since the autumn of 1905, when 
the first large series of countings was made. The results given in 
this paper were made ready for the press in January 1910, but 
the manuscript was laid aside in the hope that it would be possible 
to secure data which would show the relationship between the 
correlations discussed and the then just discovered selective 
mortality of ovaries. In this hope I have met with only disappointment, 
and it seems best to withhold the materials no longer. 

II. Materials 

The materials here analyzed were collected in three series as 
follows: 

A. A very large collection taken at Meramec Highlands, near 
St. Louis, Missouri; altogether 28,554 pods. 

B. A collection from 22 trees in the neighborhood of Lawrence, 
Kansas; 2,200 pods. 

* Harris, J. Arthur. On the relationship between the number of ovules formed 
and the capacity of the ovary for developing its ovules into seeds. Bull. Torrey 
Club 40: 447-455. Au 1913. 




244 Harris: Relationship of ovules to seeds 

C. A collection from 26 trees near Sharpsburg, Athens County, 
Ohio; 3,900 pods. 

The pods of Cercis canadensis are, like those of many other 
Leguminosae, somewhat unsatisfactory for investigations of 
fertility because of the difficulty of drawing a sharp line between 
ovules which fail to develop and those which form perfect seeds. 
It seems unfeasible, in the present state of our knowledge of these 
matters, to adopt more than the two categories, abortive ovules 
and matured seeds. In most cases, an observer will have little 
difficulty in determining to which class an individual ovule should 
be assigned. Nevertheless, we are dealing here with characters 
not perfectly discontinuous. This condition must always be 
borne in mind in considering the trustworthiness of our constants. 

In counting, we considered as abortive ovules only those which 
had not developed at all, or only slightly, beyond the stage at- 
tained in the very young pod. Some of the seeds counted as 
matured were probably not well enough developed to be viable. 
Some of them were light and apparently blighted. The cause of 
this I do not know. The ovules failing to develop are not as 
easily made out in the mature pods of Cercis as they are in some 
other Leguminosae; this increased somewhat the labor of counting. 

III. Discussion of Data 

A. The Meramec Highlands Collections 

The correlation between the number of ovules formed and the 
number of seeds developing in a first large sample of 6,000 pods, 
chiefly from a few (6 or 8) large trees growing closely together is 
shown in Table I. For comparison another 4,000 was gathered 
quite at random in the immediate neighborhood of the trees 
furnishing the first 6,000, but from a larger number (probably 25) 
of smaller trees: Table II gives the correlation surface. 

The fundamental physical constants appear in Table III. 

For the means, the differences between the two samples and the 
probable errors of the differences are 

    Ovules    +.1536 ±.0137 
    Seeds     -.0261 ±.0166 
The ovules are about fifteen one-hundredths more numerous in 
the sample collected from the few large trees. This is certainly 
not a difference which would have been detected by other than 
biometric methods, and some might consider it negligible, but it is 
slightly over eleven times its probable error and so unquestionably 
significant. No importance need be attached to the difference 
for seeds developing, which is not twice its probable error. 

TABLE III 
Physical Constants for Random Samples 

    Constants                            6,000 Pods       4,000 Pods       10,000 Pods 
    Mean of ovules                       4.7498 ±.0080    4.5952 ±.0112    4.6880 ±.0062 
    Standard deviation of ovules          .9274 ±.0057    1.0532 ±.0079     .9826 ±.0046 
    Coefficient of variation of ovules   19.5270          22.9215          20.9615 
    Mean of seeds                        3.7786 ±.0102    3.8047 ±.0131    3.7891 ±.0081 
    Standard deviation of seeds          1.1781 ±.0072    1.2377 ±.0093    1.2024 ±.0057 
    Coefficient of variation of seeds    31.1792          32.5332          31.7337 
    Correlation, ovules and seeds         .5673 ±.0059     .6782 ±.0057     .6133 ±.0042 

The differences for the standard deviations and coefficients of 
variation are: 

              Standard Deviation    Coefficient of Variation 
    Ovules    -.1258 ±.0097         -3.395 
    Seeds     -.0596 ±.0117         -1.354 

The difference in S.D. for ovules is nearly 13 times its probable 
error and for seeds about 5 times its probable error. Both are 
clearly significant. 
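
The comparisons above all rest on one rule: the probable error of a difference between two independent constants is the square root of the sum of the squares of their probable errors. The following is a minimal sketch of that arithmetic, not the author's own procedure; the function name is hypothetical, and the inputs are the standard deviations of ovules for the 6,000 and 4,000 pod samples as printed in Table III.

```python
from math import sqrt


def difference_with_probable_error(a, b):
    """Return (difference, probable error) for two independent constants.

    Each argument is a (value, probable_error) pair. The probable error of
    the difference is the root of the sum of the squared probable errors.
    """
    (value_a, pe_a), (value_b, pe_b) = a, b
    return value_a - value_b, sqrt(pe_a ** 2 + pe_b ** 2)


# Standard deviation of ovules: 6,000-pod sample versus 4,000-pod sample.
diff, pe = difference_with_probable_error((0.9274, 0.0057), (1.0532, 0.0079))
ratio = abs(diff) / pe  # how many probable errors the difference amounts to
print(f"difference = {diff:+.4f} ± {pe:.4f}; ratio = {ratio:.2f}")
```

The ratio comes out close to 13, in agreement with the statement that the difference in S.D. for ovules is nearly 13 times its probable error.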

The difference for the coefficient of correlation is 

    r    -.1109 ±.0082, 

a difference 13.5 times its probable error and undoubtedly significant. 

It appears, therefore, that our samples are sensibly differen- 
tiated from each other in type, variability and correlation. This 
fact is sufficient ground for considering their correlations inde- 
pendently. 

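The significance of the coefficient of correlation depends upon 
linearity of regression. Using the familiar equation for the 
regression straight line 

$$ s = \left( \bar{s} - r \frac{\sigma_s}{\sigma_o} \bar{o} \right) + r \frac{\sigma_s}{\sigma_o} o, $$

where $\bar{o}$ and $\bar{s}$ are the mean numbers of ovules and of seeds per pod, I find 

    For first 6,000         s = .3554 + .7207 o 
    For additional 4,000    s = .1423 + .7970 o 
    For total 10,000        s = .2712 + .7504 o 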
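
The regression constants can be recovered from the means, standard deviations and correlation alone. The sketch below is an illustration under that assumption, with a hypothetical function name; it uses the constants printed in Table III for the additional 4,000 pods.

```python
def regression_line(mean_o, mean_s, sd_o, sd_s, r):
    """Return (intercept, slope) of the regression of seeds on ovules.

    slope = r * sd_s / sd_o, and the line passes through the two means.
    """
    slope = r * sd_s / sd_o
    intercept = mean_s - slope * mean_o
    return intercept, slope


# Constants of the additional 4,000 pods, as printed in Table III.
intercept, slope = regression_line(
    mean_o=4.5952, mean_s=3.8047, sd_o=1.0532, sd_s=1.2377, r=0.6782
)
print(f"s = {intercept:.4f} + {slope:.4f} o")
```

To four decimals this agrees with the equation given for the additional 4,000 pods.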

The closeness of agreement of the observed means and those 
given by the equation is evident from Table IV where the two are 
compared. If the two extreme variates, where the numbers of 
observations are so small that little weight is to be attached to 
them, be omitted, there is only a single case out of the eighteen 
where the deviation of the observed from the theoretical mean 
reaches thirteen one-hundredths of a seed. 

The average (weighted) deviations (disregarding signs) of the 
empirical from the theoretical means are .0198 for the first 6,000, 
.0411 for the additional 4,000, and .0302 for the total 10,000 pods. 
The fit is also shown graphically for the 6,000 pod lot in fig. 1. 

TABLE IV 
Deviation of Observed Means of Arrays from Theoretical Means as Calculated 
from the Regression Straight Lines 
For the 6,000 and the 10,000 lots, I have calculated $\eta$ as well 
as r. I find 

    Constant                         Series of 6,000 Pods    Series of 10,000 Pods 
    Coefficient of correlation, r    .5673 ±.0059            .6133 ±.0042 
    Correlation ratio, $\eta$        .5677 ±.0059            .6140 ±.0038 
    Difference, $\eta - r$           .00035                  .00066 
    $\eta^2 - r^2 = \zeta$           .00040                  .00081 

Note the exceedingly small differences between r and $\eta$. For a 
scientific test of linearity, we have recourse to the constant $\zeta$ as 
suggested by Blakeman,* i.e.: 

$$ \frac{\zeta}{E_\zeta} = \frac{\sqrt{N}}{.67449} \cdot \frac{1}{2} \sqrt{\zeta}, $$

which gives 

    For 6,000 pods     $\zeta/E_\zeta$ = 1.144 
    For 10,000 pods    $\zeta/E_\zeta$ = 2.110 

Hence regression may be considered linear within the limits of the 
probable errors of random sampling. 
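
The two ratios just given can be checked from N and the printed difference of the squared correlation ratio and squared correlation coefficient. The sketch below assumes the approximate form of Blakeman's criterion, namely the square root of N divided by .67449, multiplied by half the square root of that difference; the function name is hypothetical. Because the printed differences are rounded to five decimals, the result for the 6,000 pods matches the printed 1.144 only approximately.

```python
from math import sqrt


def blakeman_ratio_simple(n, zeta):
    """Approximate ratio of zeta = eta**2 - r**2 to its probable error."""
    return sqrt(n) / 0.67449 * 0.5 * sqrt(zeta)


# zeta values as printed for the two series (rounded to five decimals).
for n, zeta in [(6000, 0.00040), (10000, 0.00081)]:
    print(n, round(blakeman_ratio_simple(n, zeta), 3))
```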

The reason so much stress has been laid upon the question of 
the nature of regression is two-fold. First, the validity of the 
correlation coefficient as a description of the relationship between 
the number of ovules formed and the number of seeds developing 
depends upon linearity of regression. Second, it is a matter of 
considerable biological importance to know that the rate of change 
in the number of seeds developing per pod remains constant from 
one end of the range of variation of number of ovules per pod to 
the other. 

The coefficient which measures the relationship between the 
number of ovules per pod and the capacity of the pod for maturing 
its seeds is not $r_{os}$ but $r_{oz}$. The results are: 

    For the first 6,000         $r_{oz}$ = -.0714 ±.0087 
    For the additional 4,000    $r_{oz}$ = -.0358 ±.0106 
    For the whole 10,000        $r_{oz}$ = -.0597 ±.0067 

The first and third constants are clearly significant statistically, 
deviating from 0 by about 8 or 9 times their probable errors; the 
second constant may also be significant, but it differs from 0 by 
only about 3.5 times its probable error. They indicate that the 
pods with the larger number of ovules are not as capable of maturing 
their ovules into seeds as those which do not produce so many, 
but that the relationship is a very slight one. 

* Biometrika 4: 332-350. 1905. 

The differences between two random samples taken so closely 
together and the very low correlation make one cautious in 
accepting our constants as biologically significant for Cercis as 
a species, or even for Cercis as a race growing at Meramec 
Highlands. Under the circumstances the only thing to be done 
is to collect wider series of data. 

The collection of this additional material was carried out from 
two standpoints; first that of widening the sources of pods in 
number of trees and variety of habitats, second, that of securing 
greater homogeneity in the series of pods upon which individual 
constants are based by taking them all from individual trees. 
The discussion of the results of analysis of data for the individual 
trees must be reserved for a later contribution. In addition to 
the general samples just described from Meramec Highlands, 
smaller lots were taken from about 125 trees. These can be 
added to the 10,000 pods already discussed. 
Consider now the total material, amounting to 28,554 pods, 
from Meramec Highlands. The data appear in Table V. The 
constants are: 

    $A_o$ = 4.7020 ± .0040,         $A_s$ = 3.8399 ± .0048, 
    $\sigma_o$ = 1.0074 ± .0028,    $\sigma_s$ = 1.2036 ± .0034, 
    $V_o$ = 21.425,                 $V_s$ = 31.3452, 
    $r_{os}$ = .6482 ± .0023,       $r_{oz}$ = -.0463 ± .0040. 
The equation to the regression straight line is 

    s = .1983 + .7745 o. 

The means and the fitted line are seen in fig. 2. Except for the 
final class, 8, the agreement of the predicted and the observed 
means is excellent, so close, indeed, that it is impossible to represent 
it graphically on a diagram of the size to be published on our page. 
Fig. 2 



To test more critically the linearity of regression, I determined 
the correlation ratio, $\eta$, and compared it with the coefficient of 
correlation r. I find 

    $\eta$ = .648443, 
    r = .648226, 
    $\eta - r$ = .000217, 

a very close agreement indeed. 
Applying Blakeman's test* and using this time his more 
exact formula, 

$$ \frac{\zeta}{E_\zeta} = \frac{\sqrt{N}}{.67449} \cdot \frac{1}{2} \sqrt{\zeta} \cdot \frac{1}{\sqrt{1 + (1 - \eta^2)^2 - (1 - r^2)^2}}, $$

I find 

    $\eta^2 - r^2 = \zeta$ = .000281, 

which gives 

    $\zeta/E_\zeta$ = 2.101. 

As far as one can state with certainty, therefore, the deviations 
of the observed means from the straight line given by the equation 
may be due to nothing more than the probable errors of random 
sampling. Comparing this diagram with those for other series 
of Cercis, I think it not unlikely that the falling off in the mean 
number of seeds for the pods with 8 ovules is biologically sig- 
nificant. 
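
The value 2.101 can be reproduced from the printed correlation ratio .648443, correlation coefficient .648226 and N = 28,554. The sketch below assumes the more exact criterion has the form of the approximate one divided by the square root of 1 + (1 - eta^2)^2 - (1 - r^2)^2, which is how the printed formula is read here; the function name is hypothetical.

```python
from math import sqrt


def blakeman_ratio_exact(n, eta, r):
    """Ratio of zeta = eta**2 - r**2 to its probable error (fuller form)."""
    zeta = eta ** 2 - r ** 2
    correction = sqrt(1 + (1 - eta ** 2) ** 2 - (1 - r ** 2) ** 2)
    return sqrt(n) / 0.67449 * 0.5 * sqrt(zeta) / correction


# Total Meramec Highlands material.
ratio = blakeman_ratio_exact(n=28554, eta=0.648443, r=0.648226)
print(round(ratio, 3))
```

With eta and r this close together the correction term is nearly 1, so the exact and the approximate forms differ only in the fourth significant figure.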

Here again, the sign of the correlation, $r_{oz}$, is negative and its 
value very small. However, $r_{oz}/E_{r_{oz}}$ = 11.63, and perhaps it is a 
significant relationship. 

It may have occurred to the reader that the negative relation- 
ship between the number of ovules per pod and the capacity of 
the pods for maturing their seeds may be due to some purely 
mathematical difficulty in dealing with the biological data — 
perhaps to some approximation in the formula. 

To reassure those who may be skeptical on this ground, I have 
actually determined the deviation of each number of seeds per 
pod from the probable number which would have occurred if 
fecundity had been the same throughout all the population and 
have determined the correlation between these deviations and 
the number of ovules per pod. 

In calculating the probable number of seeds for each pod the 
total seeds matured for all the pods was divided by the total 
ovules formed to get P, the probability of any ovule in the entire 
population — i. e., irrespective of the number of ovules in the pod 
in which it occurred — developing into a seed. This was 

P = 109,644/134,261 = .816648. 

* Biometrika 4: 350. 1905. 
To obtain the probable number of seeds developing in any class 
of pods the number of ovules is multiplied by P. 

In carrying out the arithmetic of the calculation of $r_{oz}$ by the 
"brute force" method all the deviations were written down to 
six decimal places. The values of $r_{oz}$ are: 

    Calculated by formula          -.046315 
    Calculated by "brute force"    -.046311 
    Difference                      .000004 

Comments are superfluous. 
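
The agreement of the two routes can be illustrated on a small table. The sketch below is an illustration only: the frequency table is HYPOTHETICAL (Table V is not reproduced here), the function name is hypothetical, and the closed form is obtained simply by expanding the covariance and variance of z = s - P o; the source does not print its formula, so its exact form is an assumption.

```python
from math import sqrt


def moments(rows):
    """Weighted means, standard deviations and correlation of (x, y, count) rows."""
    n = sum(c for _, _, c in rows)
    mean_x = sum(x * c for x, _, c in rows) / n
    mean_y = sum(y * c for _, y, c in rows) / n
    var_x = sum(c * (x - mean_x) ** 2 for x, _, c in rows) / n
    var_y = sum(c * (y - mean_y) ** 2 for _, y, c in rows) / n
    cov = sum(c * (x - mean_x) * (y - mean_y) for x, y, c in rows) / n
    return mean_x, mean_y, sqrt(var_x), sqrt(var_y), cov / sqrt(var_x * var_y)


# HYPOTHETICAL data: (ovules per pod, seeds per pod, number of pods).
table = [(3, 2, 5), (3, 3, 10), (4, 2, 4), (4, 3, 20), (4, 4, 30),
         (5, 3, 10), (5, 4, 25), (5, 5, 15), (6, 4, 6), (6, 5, 5)]

mean_o, mean_s, sd_o, sd_s, r_os = moments(table)
P = mean_s / mean_o  # total seeds divided by total ovules

# "Brute force": correlate ovules with the deviations z = s - P*o directly.
deviations = [(o, s - P * o, c) for o, s, c in table]
r_oz_brute = moments(deviations)[4]

# Closed form from the constants alone (expansion of cov(o, z) and var(z)).
r_oz_formula = (r_os * sd_s - P * sd_o) / sqrt(
    sd_s ** 2 + P ** 2 * sd_o ** 2 - 2 * P * r_os * sd_o * sd_s
)

print(r_oz_brute, r_oz_formula)

# The coefficient of fecundity quoted in the text for the Meramec material.
P_meramec = 109644 / 134261
```

The two values agree to within floating-point rounding, which is the same kind of check as the comparison of -.046315 with -.046311 above.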
Having actually obtained the deviations of the fertility of the 
individual pods from the probable fertility, it is easy to calculate 
the mean deviation for the arrays associated with different 
numbers of ovules per pod. The standard deviation may also 
be computed for the entire material. The equation of the regression 
line is 

    z = .19833 - .04217 o. 

The slope of this line and the empirical mean deviations are shown 
in fig. 3. 
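
The regression of the deviations on ovules follows directly from the regression of seeds on ovules, since each deviation is the number of seeds less P times the number of ovules. A short worked check, using the constants as printed (rounded here to four decimals, so the slope agrees with the printed value only to that precision):

```latex
\begin{aligned}
z &= s - P\,o, \qquad P = .8166,\\
\hat{s} &= .1983 + .7745\,o,\\
\hat{z} &= \hat{s} - P\,o = .1983 + (.7745 - .8166)\,o = .1983 - .0421\,o.
\end{aligned}
```

The intercept is unchanged and the slope is reduced by P, which is why the line for z falls slightly as the number of ovules rises.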

It thus appears that while the relationship is an exceedingly 
slight one, in all the adequately large series of material from 
Meramec Highlands the capacity of the pods for maturing their 
seeds decreases as the number of ovules per pod increases. 

B. Analysis of the Data from the Vicinity of Lawrence, Kansas 

I have to thank my father, Mr. J. T. Harris, for the collection of 
a series of 100 pods each from 22 trees in the neighborhood of 
Lawrence, Kansas. All the trees grew in the same small field. 

The correlation between the number of ovules per pod and 
the number of seeds developing per pod for the total material 
from the 22 individuals is set forth in Table VI. The results are: 

    $A_o$ = 4.916 ± .013,       $A_s$ = 4.116 ± .015, 
    $\sigma_o$ = .925 ± .009,   $\sigma_s$ = 1.030 ± .010, 
    $V_o$ = 18.82,              $V_s$ = 25.03, 
    $r_{os}$ = .603 ± .009,     $r_{oz}$ = -.183 ± .014. 

For the lumped conclusions, where N = 2,200, I find: 

    $r_{oz}/E_{r_{oz}}$ = 13.19. 

C. Analysis of the Data from the Vicinity of Sharpsburg, Ohio 

I am indebted to my grandfather, Mr. J. W. Harris, for the 
collection of 150 pods each from a series of 26 trees growing in the 
neighborhood of Sharpsburg, Athens Co., Ohio. 

Calculating from the grand total of nearly 4,000 pods summar- 
ized in Table VII, I find: 

    $A_o$ = 5.493 ± .011,       $A_s$ = 3.944 ± .016, 
    $\sigma_o$ = .983 ± .008,   $\sigma_s$ = 1.453 ± .011, 
    $V_o$ = 17.89,              $V_s$ = 36.83, 
    $r_{os}$ = .455 ± .009,     $r_{oz}$ = -.034 ± .011. 
Here $r_{oz}/E_{r_{oz}}$ = 3.15. Possibly this value is statistically significant, 
but considering its extremely small magnitude I think one 
should be cautious in attaching any biological significance to it. 
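
The ratios of a correlation to its probable error quoted in this paper can be checked with the usual expression for the probable error of a correlation coefficient, .67449 (1 - r^2) divided by the square root of N. The sketch below assumes that expression, which reproduces the printed probable errors; the function names are hypothetical, and the inputs are rounded as printed, so the last figure may differ slightly from the text.

```python
from math import sqrt


def probable_error_of_r(r, n):
    """Probable error of a correlation coefficient r from n observations."""
    return 0.67449 * (1 - r ** 2) / sqrt(n)


def ratio_to_probable_error(r, n):
    """Absolute value of r expressed in units of its probable error."""
    return abs(r) / probable_error_of_r(r, n)


# Ohio series: r_oz = -.034 from 3,900 pods.
print(round(ratio_to_probable_error(-0.034, 3900), 2))
# Meramec Highlands series: r_oz = -.0463 from 28,554 pods.
print(round(ratio_to_probable_error(-0.0463, 28554), 2))
```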
D. Comparison of Constants from Three Series of Cercis 

A detailed comparison of the characters of red bud from various 
regions of the United States or from different habitats falls outside 
the scope of this paper. I will, however, lay the results from the 
total materials of the three side by side for a casual comparison 
in Table VIII. 

It is clear without further arithmetic that many of these con- 
stants differ significantly from series to series, that is to say, the 
difference between them is several times as large as can be attrib- 
uted to the errors of sampling from a homogeneous population. 
This fact does not, however, necessarily indicate that the three 
series are "genetically," "racially" or "genotypically" distinct. 
Each general collection is composed of a (relatively) small number 
of trees. These individuals are, as will be shown later, differentiated 
in number of ovules and number of seeds per pod, and probably 
in the variability and correlation of these two characters. 
Since the general samples are made from a relatively small number 
of these individual trees some differences between collections 
might arise through the errors of sampling in the selection of 
individuals. 

TABLE VIII 
Comparison of Three Series of Cercis 

                                  Missouri Series    Kansas Series    Ohio Series 
    Total individuals             More than 123      22               26 
    Total pods                    28,554             2,200            3,900 
    Ovules 
      Mean                        4.702 ±.004        4.916 ±.013      5.493 ±.011 
      Standard deviation          1.007 ±.003         .925 ±.009       .983 ±.008 
      Coefficient of variation    21.43              18.82            17.89 
    Seeds 
      Mean                        3.840 ±.005        4.116 ±.015      3.944 ±.016 
      Standard deviation          1.204 ±.003        1.030 ±.010      1.453 ±.011 
      Coefficient of variation    31.35              25.03            36.83 
    Ovules and Seeds 
      Correlation, r_os            .648 ±.002         .603 ±.009       .455 ±.009 
      Correlation, r_oz           -.046 ±.004        -.183 ±.014      -.034 ±.011 
      Coefficient of fecundity     .8166              .8373            .7180 



Again, no account whatever can be taken of environmental 
conditions, either edaphic or meteorological. 

With two such factors, which may to some extent tend to 
bring about differences in the constants of the series dealt with, 
it has seemed to me rather surprising that the physical constants 
for ovules and seeds do not differ more widely than they do. 

The coefficients of correlation, r, differ considerably; but two 
factors influencing this constant must not be forgotten. First, 
heterogeneity, due to the mixing of the pods from a large number 
of individuals, would tend to raise the value for the Missouri 
series. This appears very clearly in a comparison of the mean 
for the 60 individual constants from trees with 100 pods each and 
the constant for the 28,000 and more pods in the lumped sample. 
The former is .599, the latter .648. Second, the coefficients of 
fecundity show that the three series differ very materially in the 
percentage of ovules developing into seeds. The lowest value of 
the coefficient of correlation for ovules formed and seeds maturing 
(in the Ohio series) is associated with the lowest value of the coef- 
ficient of fecundity. 

Finally, perhaps the most important point to be gathered from 
this table is that in all three series $r_{oz}$ is negative and of a very 
low order, but quite possibly significant even in the Ohio series. 

E. Summary and Discussion 

The foregoing pages embody the results of an attempt to 
ascertain the relationship between the number of ovules per pod 
and the capacity of the pod for maturing its ovules into seeds in 
the leguminous plant Cercis canadensis. The methods of analysis 
are those of an earlier paper on Phaseolus. The data in hand 
lead to the following conclusions : 

The correlations for number of ovules formed and number of 
seeds developing per pod, $r_{os}$, have always been found positive and 
of a moderate, considerable or even high intensity. 

Regression of number of seeds on number of ovules per pod is 
sensibly linear in a population of pods from many individual trees. 
Possibly, however, there is a departure from linearity in the pods 
with eight ovules; in my largest series there are only 36 of these 
pods out of a total of 28,554, and this number is too small to be 
given great importance. 

The significance of the linearity of regression is two-fold. 
Statistically, it justifies describing the interdependence between 
the number of ovules formed and the number of seeds maturing 
by the coefficient of correlation. Biologically, it shows that the 
rate of increase in number of seeds developing per pod remains 
the same as we pass from pods with the lowest to pods with the 
highest numbers of ovules. 

Wherever large series of pods have been examined, the correlation 
between the number of ovules per pod and the capacity of 
the pods for maturing their seeds, $r_{oz}$, has a negative sign and a 
low, usually a very low, magnitude. For every large series examined 
the value of $r_{oz}$ has been over 2.5 times its probable error. 
These evidences can leave little doubt of the existence of a slight 
negative relationship between the number of ovules formed and 
the capacity of the pod for maturing its ovules into seeds, the pods 
with the larger number of ovules producing relatively fewer seeds. 

In a subsequent paper, these conclusions will be tested upon 
the more homogeneous collections of pods from individual trees. 
Until then further discussion may be reserved. 

Cold Spring Harbor, New York 



